At the start of my corpus I decided to split Miles’ entire oeuvre into 6 distinct categories, based on the kind of music he was making at the time and the sidemen he surrounded himself with (unfortunately very few sidewomen), but I never checked whether the division into these categories was justified. With the arrival of prediction techniques in this course, I now have the tools to check whether an algorithm can tell these time periods apart as well as I can. After fiddling around with the k-nearest-neighbours (KNN) method and the random forest method, I saw that the random forest model performed slightly better overall, so that became the model of choice. The table on the left shows promising results.
First of all, there is a very clear distinction between early Miles (1945-1962), transition Miles (1963-1968) and late Miles (1969-1991), as virtually none of the earlier songs get wrongly assigned to later periods and vice versa. I call the middle period a transition because this is when Miles started distancing himself from the traditional harmony and general music theory that evolved from the swing, bop and modal ages, as he himself was influenced by the free jazz movement of Ornette Coleman, Cecil Taylor and Eric Dolphy, amongst others. The Fusion period is the most distinct of all periods in this table, with very high accuracy. The overall accuracy of the model was around 78%.
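A minimal sketch of how such a table and the overall accuracy could be computed from true and predicted period labels. The labels below are invented for illustration and the function names are hypothetical; this is not the code or the data behind the actual model.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true period, predicted period) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))

def accuracy(true_labels, predicted_labels):
    """Share of songs whose predicted period equals the true period."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Made-up labels for illustration only; these are not my actual results.
truth = ["early", "early", "transition", "fusion", "fusion", "late"]
predicted = ["early", "transition", "transition", "fusion", "fusion", "fusion"]

matrix = confusion_matrix(truth, predicted)
for (true_period, predicted_period), count in sorted(matrix.items()):
    print(f"true={true_period:<10} predicted={predicted_period:<10} n={count}")
print(f"accuracy: {accuracy(truth, predicted):.2f}")
```

Cells off the diagonal (true and predicted period differ) are the misclassifications. A period is "distinct" in the sense used above when its row has almost all of its counts on the diagonal.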
Whether my categories really hold up might finally be answered by performing a cluster analysis, but in the coming week I still have a lot of polishing and adding to do on my portfolio (it’s kind of a mess at the moment). Check out the next panel for my predictor analysis.
In this panel, the predictor duration comes out as the very clear winner of the model, which makes a lot of sense. Recording techniques in the 1940s and 1950s were still far from perfect, and people did not have the means or the tape to record long songs. This improved during the 1950s, however, when Miles started recording his first songs of over 10 minutes (All Blues from 1959, for instance, is 11 minutes long). So apart from Miles’ own preferences for duration, he was also heavily limited by the techniques of the time. Still, the duration of his songs gradually increased and peaked during his fusion period (1969-1975), in which jams could last for over 30 minutes in some instances.
The second easily interpretable predictor is acousticness. This was already apparent from my slopegraph, which showed a steady decline in acousticness over the years. Miles was a constant innovator, and the increasing popularity of electric guitars, pianos, basses and synthesizers in the mainstream also affected him. After 1969, most of his bandmates played electric instruments. He himself, however, never strayed far from his iconic muted trumpet sound.
At the bottom are tempo and the keys, which makes a lot of sense as well. Jazz culture is jam culture: each musician is brought up on jam sessions, in which any musician can call any tune in any key (and at any tempo). This is reflected in Miles’ music as well. I could perhaps visualize the distribution of tempi and keys to further substantiate this point.
While building my model, I culled several predictors, including almost all keys, tempo and speechiness (since most of the music is instrumental). Furthermore, I added interactions to my random forest model based on the slopegraph in the other panel: one between energy and acousticness, and one between instrumentalness and energy. All modifications together added about 7% to the prediction accuracy, and the random forest performed about 10% better on average than a KNN model.
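To make the idea of an interaction concrete, here is a minimal sketch that treats an interaction as the product of two features. That reading is an assumption, and the column names and values are hypothetical and only chosen for illustration.

```python
def add_interactions(song):
    """Return a copy of the song's features with two product terms added.

    Assumption: an interaction is modelled as the product of the two features.
    """
    features = dict(song)
    features["energy_x_acousticness"] = song["energy"] * song["acousticness"]
    features["instrumentalness_x_energy"] = song["instrumentalness"] * song["energy"]
    return features

# Invented feature values, for illustration only.
example_song = {"energy": 0.5, "acousticness": 0.8, "instrumentalness": 0.9}
extended = add_interactions(example_song)
print(extended)
```

The original features stay in place; the two extra columns simply give the model a direct handle on how the pairs move together.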
This week I did not have a lot of time to update my portfolio due to other courses. I have processed some data, which updated my plots and turned up some interesting new things:
- There was a problem in the database that kept a lot of albums out of my time period subsets (many album release years were NA for some reason, so I converted the exact release date from days to years and used that instead). This yielded more data.
- I have decided to omit Miles’ hiatus years (1975-1980), since Spotify only has 1 album from that period, which is a re-release from an earlier period.
- I have filtered out a lot of compilation albums, remastered releases and later releases, based on research, to try to include original recordings exclusively.
- I have standardized every song and made a list of typical songs. I have not yet had the time to analyse these songs.
- I have added and modified some interpretations of the plots.
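The release-year fix from the first point can be sketched roughly as follows. The field names `release_year` and `release_date` and the album records are hypothetical, and the sketch assumes the date is stored as text that starts with a four-digit year.

```python
def release_year(album):
    """Use the stored year if present, otherwise take it from the release date."""
    if album.get("release_year") is not None:
        return album["release_year"]
    release_date = album.get("release_date")
    # Assumption: dates look like "YYYY", "YYYY-MM" or "YYYY-MM-DD".
    if release_date and release_date[:4].isdigit():
        return int(release_date[:4])
    return None  # nothing to recover, so the gap stays visible

# Placeholder records, for illustration only.
albums = [
    {"name": "Album A", "release_year": None, "release_date": "1959-08-17"},
    {"name": "Album B", "release_year": 1970, "release_date": "1970-03-30"},
    {"name": "Album C", "release_year": None, "release_date": None},
]
years = [release_year(album) for album in albums]
print(years)
```

Albums for which neither field is usable still come out as missing, so they can be inspected by hand instead of silently landing in the wrong period.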
First, I made composite standardized z-scores based on acousticness, danceability, valence, energy and tempo. This allows me to filter out the most typical and atypical songs of each time period. The following is a list of typical songs (z < 0.01).
Most typical songs per time period (based on tempo, valence, danceability, acousticness and energy):
- 1940s: Half Nelson
- 1950s: Boplicity, Birth of the Cool
- First Quintet: ‘Round Midnight
- Second Quintet: Dolores
- Fusion: Willie Nelson - Live at Fillmore East, New York, NY - June 20, 1970
- Final years: Freaky Deaky
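One possible way to build such a composite score is sketched below. The exact formula is an assumption: here the composite is the mean absolute z-score over the five features within one time period, so a score close to 0 means a song sits near the period average on every feature. The song names and values are invented.

```python
from statistics import mean, stdev

FEATURES = ["acousticness", "danceability", "valence", "energy", "tempo"]

def composite_scores(songs):
    """Map each song name to its mean absolute z-score over FEATURES.

    Assumption: the composite is the average distance (in standard deviations)
    from the period mean. Every feature must vary, or stdev would be zero.
    """
    stats = {}
    for feature in FEATURES:
        values = [song[feature] for song in songs]
        stats[feature] = (mean(values), stdev(values))
    scores = {}
    for song in songs:
        z_values = [(song[f] - stats[f][0]) / stats[f][1] for f in FEATURES]
        scores[song["name"]] = mean(abs(z) for z in z_values)
    return scores

# Invented values for three songs from one hypothetical time period.
period_songs = [
    {"name": "Song A", "acousticness": 0.9, "danceability": 0.4, "valence": 0.3, "energy": 0.2, "tempo": 100},
    {"name": "Song B", "acousticness": 0.7, "danceability": 0.5, "valence": 0.5, "energy": 0.4, "tempo": 120},
    {"name": "Song C", "acousticness": 0.5, "danceability": 0.6, "valence": 0.7, "energy": 0.6, "tempo": 140},
]
scores = composite_scores(period_songs)
most_typical = min(scores, key=scores.get)
print(most_typical, scores)
```

With this definition the most typical song is the one with the smallest score and the most atypical song the one with the largest.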
Looking at the self-similarity matrix, the structure of the song can be read off very clearly from the pattern. I’ve added timestamps in the table to show the individual sections of the song, and they align perfectly. The setup is a quintet, with trumpet, piano and saxophone taking solos and the rhythm section (piano, bass and drums) accompanying throughout the piece. The self-similarity matrix works so well here because only one instrument is soloing at a time while the accompaniment stays the same. This makes the track very well structured and therefore easy to analyse with self-similarity matrices.
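For readers who have not met the technique before, here is a minimal sketch of how a self-similarity matrix can be built: every time frame is compared with every other frame. The use of cosine similarity and the tiny 3-dimensional feature vectors are assumptions for illustration; the matrix in this panel is based on real audio features.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def self_similarity_matrix(frames):
    """Compare every frame with every other frame."""
    return [[cosine_similarity(a, b) for b in frames] for a in frames]

# Invented frames: two that resemble each other (say, a trumpet solo),
# followed by two that resemble each other (say, a piano solo).
frames = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]
ssm = self_similarity_matrix(frames)
for row in ssm:
    print(" ".join(f"{value:.2f}" for value in row))
```

Frames within the same section score close to 1 and frames from different sections close to 0, which is what produces the block pattern along the diagonal that marks each solo.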
Have a listen to the song: